How can we search for new physics when we only vaguely know what it should look like? How 
can we perform an unbiased yet data-driven search? If we see apparently anomalous events 
in our data, how can we quantify their "interestingness" a posteriori? We present an analysis 
strategy (Sleuth) that simultaneously addresses each of these questions, and we demonstrate 
its application to over thirty exclusive final states in data collected by DØ in Run I of the 
Fermilab Tevatron. 



1 Motivation 



It is generally recognized that the standard model, an extremely successful description of the 
fundamental particles and their interactions, must be incomplete. Although there is likely to 
be new physics beyond the current picture, the possibilities are sufficiently broad that the first 
hint could appear in any of many different guises. This suggests the importance of performing 
searches that are as model-independent as possible. 

Most recent searches for new physics have followed a well-defined set of steps: first selecting 
a model to be tested against the standard model, then finding a measurable prediction of this 
model that differs as much as possible from the prediction of the standard model, and finally 
comparing the predictions to data. This is clearly the procedure to follow for a small number 
of compelling candidate theories. Unfortunately, the resources required to implement this 
procedure grow almost linearly with the number of theories. Although broadly speaking there are 
currently only three models with internally consistent methods of electroweak symmetry 
breaking (supersymmetry, strong dynamics, and theories incorporating large extra dimensions), 
the number of specific models (and corresponding experimental signatures) is in the hundreds. 
Of these many specific models, at most one is a correct description of nature. 



In these proceedings we describe an explicit prescription for searching for the physics 
responsible for stabilizing electroweak symmetry breaking, in a manner that relies only upon what 
we are sure we know about electroweak symmetry breaking: that its natural scale is on the 
order of the Higgs mass. When we wish to emphasize the generality of the approach, we say 
that it is quasi-model-independent, where the "quasi" refers to the fact that the correct model 
of electroweak symmetry breaking should become manifest at the scale of several hundred GeV. 

New sources of physics will in general lead to an excess over the expected background in 
some final state. A general signature for new physics is therefore a region of variable space in 
which the probability for the background to fluctuate up to or above the number of observed 
events is small. Because the mass scale of electroweak symmetry breaking is larger than the 
mass scale of most standard model backgrounds, we expect this excess to populate regions of 
high transverse momentum ($p_T$). The method we will describe involves a systematic search 
for such excesses. Although motivated by the problem of electroweak symmetry breaking, this 
method is generally sensitive to any new high $p_T$ physics. 
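
To make the phrase "the probability for the background to fluctuate up to or above the number of observed events" concrete, consider a single region with expected background $b$ and $N$ observed events. Neglecting the systematic uncertainty on $b$ (an assumption made here only to keep the expression short; the symbols $b$ and $N$ are used as in the algorithm description below), this probability is the Poisson tail sum:

```latex
\[
  p \;=\; \sum_{n=N}^{\infty} \frac{b^{n} e^{-b}}{n!}
    \;=\; 1 - \sum_{n=0}^{N-1} \frac{b^{n} e^{-b}}{n!}
\]
```

A small value of $p$ marks a region in which the observed count is unlikely to be a background fluctuation.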

2 SLEUTH 

Sleuth, a quasi-model-independent prescription for searching for high $p_T$ physics beyond 
the standard model, has three components: the definitions of physical objects and exclusive final 
states; the choice of variables relevant for each final state; and an algorithm that systematically 
hunts for an excess in the space of those variables, and quantifies the likelihood of any excess 
found. We consider each in turn. 

2.1 Final states 

The data are partitioned into exclusive final states using standard criteria that identify isolated 
and energetic electrons ($e$), muons ($\mu$), and photons ($\gamma$), as well as jets ($j$), missing transverse 
energy ($\not\!\!E_T$), and the presence of $W$ and $Z$ bosons. We expect the first sign of new physics to 
appear in one of these final states, but which final state it will be is anyone's guess. We analyze 
each of these final states independently. 
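
As an illustration of what partitioning into exclusive final states means in practice, the following minimal sketch assigns each event a label built from its object content, so that every event lands in exactly one final state. The object names, their ordering, and the functions `final_state_label` and `partition` are hypothetical choices made for this example; the actual identification criteria are not reproduced here.

```python
from collections import Counter, defaultdict

# Hypothetical object labels and the order in which they appear in a name.
OBJECT_ORDER = ["e", "mu", "gamma", "W", "Z", "MET", "j"]

def final_state_label(objects):
    """Build an exclusive final-state name such as 'e mu MET 2j'."""
    counts = Counter(objects)
    unknown = set(counts) - set(OBJECT_ORDER)
    if unknown:
        raise ValueError(f"unknown object types: {sorted(unknown)}")
    parts = []
    for name in OBJECT_ORDER:
        n = counts.get(name, 0)
        if n == 1:
            parts.append(name)
        elif n > 1:
            parts.append(f"{n}{name}")
    return " ".join(parts)

def partition(events):
    """Group event indices by final state; each event belongs to exactly one."""
    states = defaultdict(list)
    for index, objects in enumerate(events):
        states[final_state_label(objects)].append(index)
    return dict(states)

events = [
    ["e", "mu", "MET", "j", "j"],
    ["W", "j", "j", "j"],
    ["j", "mu", "j", "e", "MET"],  # same content as event 0, different order
]
groups = partition(events)
```

Because the label depends only on the multiset of objects, events 0 and 2 above fall into the same final state regardless of the order in which their objects are listed.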

2.2 Variables 



For each exclusive final state, we consider a small set of variables summarized in Table 1. 




Table 1: A quasi-model-independently motivated list of interesting variables for any final state. The set of 
variables to consider for any exclusive channel is the union of the variables in the second column for each row 
that pertains to that final state. 

If the final state includes        then consider the variable 

one or more charged leptons        $\sum p_T^{\ell}$ 
one or more electroweak bosons     $\sum p_T^{\gamma/W/Z}$ 
one or more jets                   $\sum' p_T^{j}$ 
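
The "union of rows" rule in the caption of Table 1 can be sketched as a simple lookup. The object names and the plain-text variable labels below are placeholders chosen for this example, not notation from the analysis; the number of variables returned is the dimension $d$ of the variable space for that final state.

```python
# Rows of Table 1 as (objects that trigger the row, variable label).
# Labels are hypothetical plain-text stand-ins for the table entries.
TABLE_1 = [
    ({"e", "mu"}, "sum pT(leptons)"),
    ({"gamma", "W", "Z"}, "sum pT(gamma/W/Z)"),
    ({"j"}, "sum pT(jets)"),
]

def variables_for(final_state_objects):
    """Return the variables for every table row that pertains to the final state."""
    present = set(final_state_objects)
    return [label for trigger, label in TABLE_1 if present & trigger]

chosen = variables_for(["e", "mu", "j", "j"])
dimension = len(chosen)  # d, the dimension of the variable space
```

For a final state with an electron, a muon, and two jets, two rows apply, so the search in that final state takes place in a two-dimensional space.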



2.3 Algorithm 

The Sleuth algorithm requires as input a data sample, a set of events modeling each background 
process $i$, and the number of background events $b_i \pm \delta b_i$ from each background process expected 
in the data sample. From these we determine the region $\mathcal{R}$ of greatest excess and quantify the 
degree $\mathcal{P}$ to which that excess is interesting. The algorithm itself, applied to each individual 
final state, consists of seven steps: 

1. We construct a mapping from the $d$-dimensional variable space defined by Table 1 into the 
$d$-dimensional unit box (i.e., $[0,1]^d$) that flattens the total background distribution. We 
use this to map the data into the unit box. 

2. We define a "region" R about a set of N data points to be the volume within the unit 
box closer to one of the data points in the set than to any of the other data points in the 
sample. The arrangement of data points themselves thus determines the regions. A region 
containing $N$ data points is called an $N$-region. 

3. Each region contains an expected number of background events $b_R$, numerically equal to the 
volume of the region times the total number of background events expected, and an associated 
systematic error $\delta b_R$, which varies within the unit box according to the systematic errors 
assigned to each contribution to the background estimate. We can therefore compute the 
probability $p_N^R$ that the background in the region fluctuates up to or beyond the observed 
number of events. This probability is the first measure of the degree of interest of a 
particular region. 

4. The rigorous definition of regions reduces the number of candidate regions from infinity 
to $\approx 2^{N_{\rm data}}$. Imposing explicit criteria on the regions that the algorithm is allowed to 
consider further reduces the number of candidate regions. We apply geometric criteria 
that favor high values in at least one dimension of the unit box, and we limit the number 
of events in a region to fifty. The number of remaining candidate regions is still sufficiently 
large that an exhaustive search is impractical, and a heuristic is employed to search for 
regions of excess. In the course of this search, the $N$-region $\mathcal{R}_N$ for which $p_N^R$ is minimum 
is determined for each $N$, and $p_N = \min_R (p_N^R)$ is noted. 

5. In any reasonably sized data set, there will always be regions in which the probability for 
$b_R$ to fluctuate up to or above the observed number of events is small. We determine 
the fraction $P_N$ of hypothetical similar experiments (hse's) in which $p_N$ found for the hse 
is smaller than $p_N$ observed in the data by generating random events drawn from the 
background distribution and computing $p_N$ by following steps 1-4. 

6. We define $P$ and $N_{\min}$ by $P = P_{N_{\min}} = \min_N (P_N)$, and identify $\mathcal{R} = \mathcal{R}_{N_{\min}}$ as the most 
interesting region in this final state. 

7. We use a second ensemble of hse's to determine the fraction $\mathcal{P}$ of hse's in which $P$ found in 
the hse is smaller than $P$ observed in the data. The most important output of the algorithm 
is this single number $\mathcal{P}$, which may loosely be said to be the "fraction of hypothetical 
similar experiments in which you would see an excess as interesting as what you actually 
saw in the data." $\mathcal{P}$ takes on values between zero and unity, with values close to zero 
indicating a possible hint of new physics. The computation of $\mathcal{P}$ rigorously takes into 
account the many regions that have been considered within this final state. 
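
The role of the hypothetical similar experiments in the steps above can be illustrated with a deliberately simplified toy. This is not the Sleuth implementation: it assumes a one-dimensional unit interval in which the background is already flat, replaces the nearest-neighbor regions of step 2 with upper-tail intervals ending at each data point, ignores systematic errors, and collapses the two ensembles of steps 5-7 into a single one. All function names are hypothetical.

```python
import math
import random

def poisson_tail(n_obs, b):
    """Probability that a Poisson background of mean b yields >= n_obs events."""
    term = math.exp(-b)
    cdf = 0.0
    for k in range(n_obs):
        cdf += term
        term *= b / (k + 1)
    return max(0.0, 1.0 - cdf)

def min_tail_probability(points, b_total):
    """Smallest tail probability over the intervals [x, 1] ending at each point."""
    best = 1.0
    for n, x in enumerate(sorted(points, reverse=True), start=1):
        b_region = (1.0 - x) * b_total  # region volume times total background
        best = min(best, poisson_tail(n, b_region))
    return best

def sample_poisson(rng, mean):
    """Draw a Poisson variate (Knuth's method; adequate for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def fraction_more_interesting(data_points, b_total, n_hse=2000, seed=1):
    """Fraction of background-only toy experiments with a smaller minimum p."""
    rng = random.Random(seed)
    observed = min_tail_probability(data_points, b_total)
    smaller = 0
    for _ in range(n_hse):
        hse = [rng.random() for _ in range(sample_poisson(rng, b_total))]
        if min_tail_probability(hse, b_total) < observed:
            smaller += 1
    return smaller / n_hse

background_like = [0.05 + 0.1 * i for i in range(10)]
with_excess = background_like + [0.990 + 0.001 * i for i in range(6)]
frac_background = fraction_more_interesting(background_like, b_total=10.0)
frac_excess = fraction_more_interesting(with_excess, b_total=10.0)
```

In this toy the background-like sample yields a large fraction, while the sample with six extra events clustered near the upper edge yields a fraction close to zero. The point of the ensemble is that the smallest tail probability found by scanning many regions is not itself a probability, and only the comparison with background-only pseudo-experiments accounts for the number of regions examined.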

The smallest $\mathcal{P}$ found in the many different final states considered ($\mathcal{P}_{\min}$) determines $\tilde{\mathcal{P}}$, the 
"fraction of hypothetical similar experimental runs (hser's) that would have produced an excess 
as interesting as actually observed in the data," where an hser consists of one hse for each final 
state considered. $\tilde{\mathcal{P}}$ is calculated by simulating an ensemble of hypothetical similar experimental 
runs, and noting the fraction of these hser's in which the smallest $\mathcal{P}$ found is smaller than the 
smallest $\mathcal{P}$ observed in the data. Because $\tilde{\mathcal{P}}$ depends only on the single final state that defines 
$\mathcal{P}_{\min}$, correlations among final states may be neglected in this calculation. Like $\mathcal{P}$, $\tilde{\mathcal{P}}$ takes 
on values between zero and unity, and the potential presence of new high $p_T$ physics would be 
indicated by finding $\tilde{\mathcal{P}}$ to be small. The difference between $\tilde{\mathcal{P}}$ and $\mathcal{P}$ is that in computing $\tilde{\mathcal{P}}$ we 
account for the many final states that have been considered. 
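
The step from the smallest per-final-state value to the run-level fraction can be sketched under a strong idealization: assume each of the final states independently yields a value uniformly distributed on [0, 1] when only background is present. This ignores the discreteness that arises in sparsely populated or unpopulated final states, so it is an illustration of the idea, not the calculation used in the analysis. The function names and the example inputs (0.05 and 30 final states) are hypothetical.

```python
import random

def run_fraction_closed_form(p_min, n_states):
    """Chance that at least one of n_states uniform values falls below p_min."""
    return 1.0 - (1.0 - p_min) ** n_states

def run_fraction_simulated(p_min, n_states, n_runs=20000, seed=7):
    """Same quantity from an ensemble of toy experimental runs."""
    rng = random.Random(seed)
    smaller = 0
    for _ in range(n_runs):
        # One toy run: one background-only value per final state.
        if min(rng.random() for _ in range(n_states)) < p_min:
            smaller += 1
    return smaller / n_runs

closed = run_fraction_closed_form(0.05, 30)
simulated = run_fraction_simulated(0.05, 30)
```

Under these assumptions a smallest per-final-state value of 0.05 among thirty final states corresponds to a run-level fraction of roughly 0.79, which shows why a value that looks small in one final state is unremarkable once all final states are taken into account.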

3 Results 

Sleuth's performance on representative signatures has been studied. When ignorance of $t\bar{t}$ 
is feigned in the $e\mu X$ final states, we find $\mathcal{P}_{e\mu \not\!\!E_T 2j} = 1.9\sigma$ in DØ data, correctly suggesting the 
presence of $t\bar{t}$. Feigning ignorance of $t\bar{t}$ in the $W$+jets-like final states, we find $\mathcal{P}_{\min} > 3\sigma$ in 
30% of an ensemble of mock experimental runs on the final states $W\,3j$, $W\,4j$, $W\,5j$, and $W\,6j$. 
Dedicated searches for the top quark in these channels yield an excess of $2.75\sigma$ in $e\mu \not\!\!E_T 2j$, 
$2.6\sigma$ in $W4j(nj)$ with no $b$-tag, and $3.6\sigma$ in $W3j(nj)$ with a $b$-tag. We see that Sleuth 
performs surprisingly well despite being denied $b$-tagging information and any information about 
the properties of the signal it is supposed to be looking for. These and similar studies performed 
in other final states suggest that Sleuth is sensitive to a variety of new physics signatures. 
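
The results above quote the measure of interest in units of standard deviations, while the algorithm defines it as a fraction between zero and unity. A minimal sketch of the conversion follows, assuming the usual one-sided Gaussian tail convention (this convention is an assumption of the example, and the function names are hypothetical).

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def sigma_to_probability(n_sigma):
    """One-sided Gaussian tail probability beyond n_sigma standard deviations."""
    return 1.0 - _STANDARD_NORMAL.cdf(n_sigma)

def probability_to_sigma(probability):
    """Inverse of sigma_to_probability; requires 0 < probability < 1."""
    return _STANDARD_NORMAL.inv_cdf(1.0 - probability)

tail_at_1p9 = sigma_to_probability(1.9)
round_trip = probability_to_sigma(tail_at_1p9)
```

With this convention $1.9\sigma$ corresponds to a fraction of about 0.029, and $3\sigma$ to about 0.0013.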

We have applied Sleuth to over thirty exclusive final states at DØ, determining $\mathcal{P}$ for 
each. Upon taking into account the many final states (both populated and unpopulated) that 
are considered, we find $\tilde{\mathcal{P}} = 0.89$, implying that 89% of an ensemble of hypothetical similar 
experimental runs would have produced a final state with a candidate signal more interesting 
than the most interesting observed in these data. 

4 Conclusions 

We have applied Sleuth to search for new high $p_T$ physics in data spanning over thirty exclusive 
final states collected by the DØ experiment during Run I of the Fermilab Tevatron. A quasi- 
model-independent, systematic search of these data has produced no evidence of physics beyond 
the standard model. The strategy proposed here may prove useful in future searches for new 
phenomena. 